Spatial & Temporal Associations Between Drug Treatment Center Operation and Fatal Opioid Overdoses in Cook County, IL, 2014-2024

BMIN503/EPID600 Final Project

Author

Jamie Benson


library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.1     ✔ stringr   1.5.2
✔ ggplot2   4.0.0     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(kableExtra)

Attaching package: 'kableExtra'

The following object is masked from 'package:dplyr':

    group_rows
library(leaflet)
library(tmap)
library(ggthemes)
library(flextable)

Attaching package: 'flextable'

The following objects are masked from 'package:kableExtra':

    as_image, footnote

The following object is masked from 'package:purrr':

    compose
library(sf)
Linking to GEOS 3.13.0, GDAL 3.8.5, PROJ 9.5.1; sf_use_s2() is TRUE
# Load targets
library(targets)

1 Overview

Give a brief a description of your project and its goal(s), what data you are using to complete it, and what two faculty/staff in different fields you have spoken to about your project with a brief summary of what you learned from each person. Include a link to your final project GitHub repository.

2 Introduction

Opioid use disorder (OUD) represents a persistent public health crisis in the United States, contributing substantially to national mortality, morbidity, and healthcare expenditure. Recent estimates found a 289% increase in unintentional opioid-related mortality between 2011 and 2021, with particularly pronounced effects observed during the COVID-19 pandemic. During this period, years of life lost (YLL) from opioid-related causes increased by 62.9%, with adolescents aged 15-19 experiencing a 160% increase in YLL, from 1.5 to 3.9 years per 1,000 individuals. (Citations are in Zotero, I just need to bring them in with Bibtex)

Treatment approaches for OUD encompass both inpatient detoxification and outpatient services, including medication-assisted treatment (MAT) with partial opioid agonists such as buprenorphine, naltrexone, and methadone.Despite demonstrated efficacy, significant treatment gaps persist, with approximately 40% of U.S. counties lacking authorized buprenorphine providers as of 2018. Access disparities are particularly evident along the rural-urban continuum, where facility density and treatment modality availability vary substantially.

This study aims to examine the temporal and spatial associations between drug treatment center operations and incident neighborhood opioid overdose fatalities in a large U.S. metropolitan area. Specifically, I will investigate: 1) the relationship between facility opening/closure events and changes in following year areal overdose rates, 2) spatial spillover effects on adjacent areas, and 3) the correlation of these effects with community social vulnerability indicators.

3 Methods

The analysis pipeline was run using the R: Targets platform given the large volume of moving parts. This allows for flexible iteration and for me to ensure all components are up to date, and that I am not re-running things unnecessarily if a given component of the analysis is up to date.

The pipeline is summarized and visualized below:

tar_manifest()
# A tibble: 49 × 3
   name           command                                                pattern
   <chr>          <chr>                                                  <chr>  
 1 keys_crosswalk "read_csv(\"data/working_data/keys_crosswalk.csv\")"   <NA>   
 2 bg_poly        "get_acs(geography = \"cbg\", state = \"IL\", county … <NA>   
 3 buffers        "seq(60.96, 804.67, 60.96)"                            <NA>   
 4 city_names     "load_city_names(file = \"data/us_cities_states_count… <NA>   
 5 ods_pull       "load_chicago_od_data()"                               <NA>   
 6 keys           "import_keys(directory = \"data/dtc_keys\")"           <NA>   
 7 years          "c(2005:2023)"                                         <NA>   
 8 csv_df         "csv_clinics()"                                        <NA>   
 9 basemap        "leaflet_basemap()"                                    <NA>   
10 acs_mi         "load_acs(vintage = 2023, countyname = \"Cook\", stat… <NA>   
# ℹ 39 more rows
tar_visnetwork()
+ pdf_df declared [19 branches]
+ pdf_split declared [19 branches]
+ candidate_pairs declared [1723 branches]
+ chunk_search_poly declared [13 branches]
+ nearby_ods declared [13 branches]
+ chunk_distances declared [13 branches]
+ att_open_buffer declared [13 branches]
tar_make()
+ pdf_df declared [19 branches]
+ pdf_split declared [19 branches]
+ candidate_pairs declared [1723 branches]
+ chunk_search_poly declared [13 branches]
+ nearby_ods declared [13 branches]
+ chunk_distances declared [13 branches]
+ att_open_buffer declared [13 branches]
✔ skipped pipeline [495ms, 1855 skipped]

Drug Treatment Centers The National Survey of Substance Abuse Treatment Services (N-SSATS) is conducted by the Substance Abuse and Mental Health Services Administration (SAMSHA) on an annual basis. Facilities which voluntarily responded and agreed to be represented are included in the directory. For each facility, information on their service offerings, licensure, payment forms and funding structure, street address, and contact information are collected.

I obtained N-SSATS directory listings in PDF format for the years 2005-2021, and CSV format for 2022-2024. Prior to 2005, the text was not directly embedded in each PDF document, and was instead created through optical character recognition, introducing significant barriers to data cleaning. Data ingestion from each PDF was performed using the R package readPDF.

Similar to other studies (Cantor 2022), I linked facilities longitudinally across years using geolocation (ArcGIS Pro, V.10.8.2; Esri) and probabilistic linkage. I conducted this “fuzzy” linkage by generating Jaro-Winkler string similarity scores for each field of each clinic-year observation (CITE). Pairwise comparisons were blocked within postal near-neighbor clusters of 50 clinics for computational efficiency. The sub-scores for geocoded X/Y coordinates, street address, and clinic name were weighted at 4,3, and 1 respectively, then summed into a total score. Pairwise comparisons with similarity scores above the 95th percentile were considered true matches, and above the 90th percentile considered likely matches: among those, the highest similarity scoring candidate was chosen. Matching was conducted using the reclin2 package. DTCs were occasionally listed under two different entities providing different services, or on different floors of the same street address. As our unit of analysis is the point location of each DTC, centers meeting these criteria (and which were geolocated within 200ft of one another) were represented as a single site, using the most common lat/lon coordinate pair in the group.

Overdose Fatalities Geolocated fatal overdoses were obtained from the Cook County, IL, medical examiner’s office, which provides a live API feed with all decedents under the jurisdiction of the medical exmainer. This includes a near-census of fatal opioid overdoses occurring out of the hospital. Opioid overdoses were identified using regex string comparison of known drug names contianed in primary and secondary cause of death fields, as well as the medical exmainer’s determination of “opioid involvement.”

Community Covariates I obtained and linked block-group level community socio-demographic variables from the American Communities Survey (ACS) 5-year estimates. Shapefiles and source tables were retrieved from the IPUMS National Historical Geographic Information System (NHGIS), and the U.S. Census Bureau via tidycensus.

ACS variables included median household income, race and ethnicity counts, sex, owner-occupied housing, female-headed households with children, educational attainment > 25 years of age, and total population. From these variables we calculated both the Neighborhood Deprivation Score (NDS) and the Index of Concentration at the Extremes (ICE), as described by Furr-Holden et. al. For multivariate models, DTCs were assigned the covariates of the block group containing them.

Primary Outcome The primary outcome of interest was the incidence of fatal overdose in the geographic area surrounding DTCs. Density calculations were performed using concentric rings centered around the point location of each DTC, with the density determined by dividing the count of overdoses within each ring by the total area of the ring, for each year. Ring sizes were selected at even Euclidean 100ft radius increments, up to 0.5mi, a common cutoff for “within walking distance” in a city.

The second measure we explored was the median Euclidean distance to an overdose from each DTC. We compare the discriminatory ability of these two measures in sensitivity analysis.

4 Results

Initial results:

This code chunk shows the number of clinics identified over the last 20 years in Cook County:

tar_load(clinic_year_summary)
tar_load(plot_clinic_year)
tar_load(plot_clinic_year_matrix)

flextable(clinic_year_summary[[1]])

year

n

official_count

error

2,005

11,060

13,371

17.283674

2,006

10,777

13,771

21.741340

2,007

10,955

13,648

19.731829

2,008

11,267

13,688

17.687025

2,009

11,259

13,513

16.680234

2,010

11,248

13,339

15.675838

2,011

11,278

13,720

17.798834

2,012

11,542

14,311

19.348753

2,013

11,697

14,148

17.324003

2,014

11,465

14,152

18.986716

2,015

11,928

13,873

14.020039

2,016

11,714

14,399

18.647128

2,017

12,031

13,585

11.439087

2,018

12,114

14,809

18.198393

2,019

13,426

15,961

15.882464

2,020

14,170

16,066

11.801320

2,021

14,281

16,448

13.174854

2,022

12,176

12,176

0.000000

2,023

12,744

12,744

0.000000

2,024

12,200

13,074

6.685024

plot_clinic_year

plot_clinic_year_matrix

Here is a map of them:

tar_load(basemap)
tar_load(clinics_open)

basemap %>%
  addCircleMarkers(data = clinics_open %>% st_transform('+proj=longlat +datum=WGS84'),
    stroke = FALSE, fillOpacity = 0.5, radius = 2
  ) 

Here is a map showing all the layers I am using for analysis overlaid! It’s a lot. Yellow is clinics. Red is overdoses. ICE score is a chloropleth behind it all on the block groups.

tar_load(all_layer_map)

all_layer_map

This shows a map of the overdose rate per 1,000 residents in each block group.

tar_load(od_rate_map)

od_rate_map
Warning in scale_fill_viridis_c(trans = "log", breaks = c(1, 5, 10, 20, :
log-2.718282 transformation introduced infinite values.

Lastly, these show tables of VERY preliminary results from two difference in difference models, one looking at the estimated change in density of overdoses per square kilometer within a given buffer size, and one looking at the change in distance to an overdose, both following the opening of a clinic.

tar_load(att_open_buffer)
tar_load(att_open_dist)

flextable(att_open_buffer)

estimate

se

ci_low

ci_high

buffer

outcome

density

direction

0.017597323

0.03694192

-0.054808847

0.09000349

60.96 [m]

od_count

true

Opening

0.040869708

0.02260681

-0.003439646

0.08517906

121.92 [m]

od_count

true

Opening

0.030433161

0.06228279

-0.091641117

0.15250744

182.88 [m]

od_count

true

Opening

-0.037311412

0.14185131

-0.315339971

0.24071715

243.84 [m]

od_count

true

Opening

0.025544710

0.07828499

-0.127893875

0.17898329

304.80 [m]

od_count

true

Opening

-0.013846136

0.08938742

-0.189045488

0.16135322

365.76 [m]

od_count

true

Opening

0.067683001

0.06488514

-0.059491879

0.19485788

426.72 [m]

od_count

true

Opening

-0.136387958

0.14628646

-0.423109425

0.15033351

487.68 [m]

od_count

true

Opening

0.003090967

0.20591617

-0.400504735

0.40668667

548.64 [m]

od_count

true

Opening

-0.048004964

0.24422647

-0.526688846

0.43067892

609.60 [m]

od_count

true

Opening

-0.073152150

0.20858235

-0.481973559

0.33566926

670.56 [m]

od_count

true

Opening

-0.375970704

0.26060711

-0.886760634

0.13481923

731.52 [m]

od_count

true

Opening

-0.007850986

0.12081050

-0.244639573

0.22893760

792.48 [m]

od_count

true

Opening

flextable(att_open_dist)

estimate

se

ci_low

ci_high

buffer

outcome

density

direction

10.62999

47.27229

-82.0237

103.2837

548.64 [m]

median_distance

false

Opening

5 Conclusion

This the conclusion. The Section 4 can be invoked here.